Adaptive canceller filter module

ABSTRACT

An adaptive canceller filter module having signal sensors and signal filters in a circuit for use with filtered-x algorithms to adapt the coefficients of one of said filters to minimize the measure of the error.

This invention is concerned with an LMS filter chip that can be used as a building block in low cost noise cancellation applications.

SUMMARY OF THE INVENTION

Prior art adaptive canceller filters have consisted of software code in digital computers, often in digital signal processor (DSP) microprocessors such as the TMS320C25 by Texas Instruments, or of discrete hardware components in analog systems. The software-DSP methods lack speed, while the analog methods contain an excess of parts and lack adaptability.

INMOS has produced a multiply-accumulate chip, the IMS A100, that can be used to implement digital filters, but no method is provided for adaptation of the filter coefficients.

The current invention overcomes the prior art limitations while providing a standard adaptive filter module that can operate at high speed, reduces hardware parts count and will lead to reduced cost from volume production.

It works with the Digital Virtual Earth (DVE) algorithm and the AFF algorithm which achieves noise cancellation without requiring either a noise reference or a sync signal, as described in U.S. Pat. No. 5,105,377, which is hereby incorporated by reference herein. It also works with the Filtered-x class of algorithms, which utilize a reference signal and adapt an FIR filter to minimize the power of an error signal.

Accordingly, it is an object of this invention to provide a standard adaptive filter module for FIR or IIR filters that can operate at high speed.

Another object is to provide the architecture for a filtered-x adaptive filter module.

A still further object of this invention is to provide an adaptive filter module that will reduce the amount of hardware in active noise cancellation apparatus.

Furthermore, it is an object of this invention to provide a novel filtered-x adaptive filter for enhancing the operation of the digital virtual earth (DVE) algorithm.

These and other objects of this invention will become apparent when reference is had to the accompanying description and drawings.

FIG. 1 is a simplified block diagram of the DVE algorithm,

FIG. 2 is a simplified block diagram of a filtered-x adaptive canceller.

FIG. 3 is a block diagram of the method for measuring the external impulse response.

FIG. 4 shows a block diagram of the module configured so as to have the filter and adapter as a single module.

FIG. 5 is a chart of the system identification and calibration.

FIG. 6 is a diagram of the adaptive canceller filter module, and

FIG. 7 is a diagram of the architecture of the filter chip.

THE DVE ALGORITHM/FILTERED-X ALGORITHM Description

The Digital Virtual Earth (DVE) algorithm holds the promise of a generic solution to active cancellation without requiring either a noise reference or a sync signal. To date it has shown promise in applications which have either a relatively stationary noise signature or a simple noise spectrum.

DVE works by first estimating the original noise from the measurement of the residual and knowledge of its anti-noise output as modified by the system transfer function "C". This estimated noise is passed through an adaptive filter to then produce the anti-noise. The Filtered-X LMS adaptation of the filter then minimizes the residual signal power. FIG. 1 shows a simplified block diagram of the DVE algorithm where

"C"--is the system transfer function, that is, the system's effect on the anti-noise as it flows through the D/A (and its reconstruction filter), through the anti-noise driver to the residual microphone, and through the A/D (and its anti-aliasing filter).

"C" (estimate)--An estimate of "C", established by "calibration" and copied into two physical filters:

Filtered-X: This filter makes certain that the LMS updates are done using consistent measures. The reference signal (X--the estimated noise signal) is passed through a copy of "C" since the anti-noise signal has gone through the real "C".

Anti-Noise: This filter is used to estimate the actual anti-noise delivered. The anti-noise estimate is then subtracted from the residual to estimate the original noise.

"A"--The adaptive equalizer. This filter adapts to minimize the residual signal power.

This system achieves cancellation when the adaptive equalizer, "A", has a transfer function which is the inverse of C₁ (the first half of "C", including the D/A, the reconstruction filter, the anti-noise driver and the transit delay to the mike). Since "A" must be causal, this inverse function must cycle slip tonal components to achieve the correct phase.
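The DVE signal flow described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the patented implementation: the plant taps, the single-tone noise, the step size `mu`, and the names `simulate_dve`, `fir` and `mean_power` are all hypothetical, the calibrated copy of "C" is assumed to be exact, and the anti-noise is assumed to add to the noise at the microphone.

```python
import math


def fir(coeffs, history):
    """Dot product of FIR coefficients with a newest-first sample history."""
    return sum(c * s for c, s in zip(coeffs, history))


def simulate_dve(steps=4000, taps=8, mu=0.01):
    """Run a toy DVE loop and return the residual at every step."""
    # Hypothetical plant "C": its taps act on y(k-1) and y(k-2), so the
    # plant includes a one-sample transit delay.
    c_true = [0.5, 0.3]
    c_hat = list(c_true)           # calibrated copy of "C" (assumed exact)
    a = [0.0] * taps               # adaptive equalizer "A"
    y_hist = [0.0] * len(c_true)   # past anti-noise outputs, newest first
    x_hist = [0.0] * taps          # estimated noise samples, newest first
    fx_hist = [0.0] * taps         # filtered-x samples, newest first
    residuals = []
    for k in range(steps):
        noise = math.sin(2.0 * math.pi * 0.05 * k)   # single-tone test noise
        residual = noise + fir(c_true, y_hist)       # microphone signal
        # Estimate the original noise: residual minus estimated anti-noise.
        x = residual - fir(c_hat, y_hist)
        x_hist = [x] + x_hist[:-1]
        # Filtered-x: the reference passed through the copy of "C",
        # including its one-sample delay.
        fx = fir(c_hat, x_hist[1:])
        fx_hist = [fx] + fx_hist[:-1]
        # LMS step on "A" that reduces the residual power.
        a = [w - mu * residual * f for w, f in zip(a, fx_hist)]
        # New anti-noise sample; it reaches the microphone on the next step.
        y = fir(a, x_hist)
        y_hist = [y] + y_hist[:-1]
        residuals.append(residual)
    return residuals


def mean_power(samples):
    """Average power of a list of samples."""
    return sum(s * s for s in samples) / len(samples)
```

Comparing `mean_power` over the first and last few hundred residual samples gives a rough indication of how far the adaptation has reduced the tone.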

The highest frequency that can be canceled is determined by the sample rate. Depending on the desired performance, the sampling frequency must be 4 to 6 times the highest frequency to be processed.

The sampling rate and filter length determine the number of filter weights in "A". Since these weights are LMS adapted and their adaptations interact with each other, the more complex the noise, the longer adaptation will take. Noise complexity is a function of:

The number of noise components.

The spacing between noise components (closer spacing means higher complexity).

The closest spacing of tonal components for which a steady state solution exists is determined by the frequency resolution of the filter, which is computed by dividing the sample rate by the filter length. Frequencies that are closer together than this spacing also can be canceled by continuously adapting "A" at the difference frequency between the tones. This is, of course, only possible when the overall noise complexity allows adaptation at a rate consistent with the difference frequency.
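The two sizing rules above (a sample rate of 4 to 6 times the highest frequency, and a frequency resolution equal to the sample rate divided by the filter length) can be written as a short calculation. The function names and the example figures below are illustrative assumptions and do not come from the specification.

```python
def min_sample_rate_hz(highest_hz, factor=4.0):
    """Sample rate needed for a given highest frequency (factor of 4 to 6)."""
    if not 4.0 <= factor <= 6.0:
        raise ValueError("the text calls for a factor between 4 and 6")
    return factor * highest_hz


def frequency_resolution_hz(sample_rate_hz, filter_length):
    """Closest tonal spacing that still has a steady state solution."""
    return sample_rate_hz / filter_length


# Hypothetical example: cancel up to 500 Hz with a 32 tap filter.
rate = min_sample_rate_hz(500.0, factor=4.0)       # 2000.0 Hz
resolution = frequency_resolution_hz(rate, 32)     # 62.5 Hz
```

In this hypothetical case, tones closer together than 62.5 Hz would rely on continuous adaptation at their difference frequency, as the text describes.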

A Non Adaptive DVE (Low Cost Cancellation)

Since the solution DVE finds is a global solution, independent of the specific noise levels, frequencies and phase, there is an opportunity to develop a low cost nonadaptive version of DVE. It is appropriate when the transfer function "C" does not vary (as in open-back headsets) and the minimum noise component spacing is predictable. In this solution calibration becomes a two step operation:

Find "C"--This is the current calibration procedure. A noise source is synthesized and passed through the external system. The result is correlated with the input to determine the "impulse response". (The frequency domain transform of this impulse response is the transfer function.)

Determine "A"--Take the inverse transform of "C₂" divided by "C", where "C₂" is the known transfer function of the anti-aliasing filter associated with the A/D converter.
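The first calibration step (correlating a synthesized noise source with the system's response to recover the impulse response) might be sketched as follows. The random plus/minus one noise source, the sample count, the seed and the function names are assumptions made for illustration; the external system is simulated here by a known impulse response so that the estimate can be checked.

```python
import random


def plant_output(impulse_response, signal):
    """Convolve the test signal with the (normally unknown) external system."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for j, h in enumerate(impulse_response):
            if n - j >= 0:
                acc += h * signal[n - j]
        out.append(acc)
    return out


def estimate_impulse_response(signal, response, taps):
    """Cross-correlate input and output; assumes a white, unit-power input."""
    estimate = []
    for lag in range(taps):
        acc = sum(signal[n - lag] * response[n]
                  for n in range(lag, len(signal)))
        estimate.append(acc / (len(signal) - lag))
    return estimate


def calibrate(true_response, samples=20000, seed=1):
    """Synthesize noise, pass it through the system, correlate the result."""
    rng = random.Random(seed)
    noise = [rng.choice((-1.0, 1.0)) for _ in range(samples)]
    measured = plant_output(true_response, noise)
    return estimate_impulse_response(noise, measured, len(true_response))
```

The estimate is statistical, so its accuracy improves with the number of noise samples used.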

The Motorola 56200 Description

The Motorola 56200 Adaptive Finite Impulse Response (FIR) Filter chip is designed to do most of the processing steps in adaptive feedforward cancellation when there are no delays in the plant. It was developed to cancel echoes in speaker phones and has been applied to other applications as well (e.g. the Toshiba quiet refrigerator). It is capable of processing a 256 tap FIR filter at a sample rate of over 200 kHz and LMS adapting the filter weights in real time if the sample rate drops to less than 19 kHz. 56200 chips can be cascaded to provide adaptive filters of almost any length and speed. This is a cost effective means to achieve a high performance DVE system. Its only problem is that since its primary application was "in-wire", there are no provisions in the chip to do Filtered-X.

FILTERED-X

The 56200 passes the reference signal "X" through its 256 sample delay line. These delayed samples (16 bit) are multiplied by the correct weights (24 bit) and accumulated in a 40 bit adder to produce the filter output. The SAME samples are also used to calculate the LMS adaptation of the filter weights, leaving no opportunity to use an equalized version of X for robust adaptation. Two solutions to providing for Filtered-X operation have been proposed.

The first solution is to separate the adaptation calculations from the filter calculations. This straightforward solution has a cost of doubling the data memory requirement but then allows storage of both X and Filtered-X. It can also be done by using separate 56200s for each function and regularly copying the weights from the adaptive chip to the filter chip.

An alternative solution is to add a fixed equalizer at the output of the 56200 (the inverse of C, similar to the solution in the low-cost approach). The filter needed for Filtered-X then becomes a fixed delay, which can be implemented as an offset register between the times at which the weight updates are calculated and the times at which the filter weights change. This has an overall lower cost than the first solution.

DESCRIPTION OF THE INVENTION

The basic adaptive filter constructs are described by Widrow. A widely applicable construct is the filtered-x method illustrated in block diagram form in FIG. 2. This method consists of a reference signal, called x, which is passed through a filter A, often an FIR or IIR filter, to produce a cancellation signal. The reference signal is further filtered by a filter C, again often an FIR filter, the impulse response of which models the phase characteristics of the external environment of the error path, sometimes referred to as the "plant". The filtering of the reference signal x led to the name "filtered-x". The output of filter C and the error output of the plant are utilized to adapt the coefficients of filter A to minimize a measure of the error, often the average power.

This general technique is used for system identification to measure the external impulse response, filter C, as well as to generate and adapt the cancellation signal. FIG. 3 shows a block diagram of the method for measuring the external impulse response.

In the current invention the convolution performed by filter C and the adapter are integrated in a single module. This basic module can have wide applicability to adaptive filters and active cancellation, thus reducing the time to produce new systems and reducing cost from large scale manufacturing. FIG. 4 shows this module as configured in the system identification operation of FIG. 3.

The invention consists of a first delay line, DL1; a vector of filter coefficients, C; an adapter; and a second delay line, DL2. The invention operates at discrete time steps. During each step the following operations take place:

1. The contents of the first delay line are shifted one place,

2. The current input to the first delay line is placed in the initial position in the delay line,

3. The current values of the filter coefficients are convolved with the first delay line, producing an output value Y_(k) at step k according to

    Y_k = \sum_{n} C_n \, \mathrm{DL1}_n

where DL1_(n) is the n'th entry in the delay line DL1,

C_(n) is the n'th coefficient in C and

n ranges over the number of entries in the delay line DL1 and the coefficients in C.

4. The contents of the second delay line are shifted one place,

5. The current input to the second delay line is placed in the initial position in the delay line,

6. The current values of the filter coefficients are adapted according to the adaptation algorithm using the current values of the filter coefficients, the contents of the second delay line and an adaptation rate term α to assure convergence.

In a preferred implementation an LMS algorithm is used for adaptation. The filter coefficients are then adapted according to the following:

    C_{n,k+1} = C_{n,k} + \alpha \, e_k \, \mathrm{DL2}_n

where:

C_(n,k) is the n'th coefficient being adapted at step k,

e_(k) is the value of the "error" input to the module at step k,

DL2_(n) is the n'th entry in the delay line DL2 and

α is set for convergence.

In a particularly preferred implementation, α will be stored in a register within the module that can be updated externally.
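The six steps and the LMS update above can be sketched in software as follows. This is an illustrative model only, not the hardware design: the class and method names, the split into `filter` and `adapt` calls, and the system identification driver with its plant taps, seed and adaptation rate are all assumptions.

```python
import random


class AdaptiveFilterModule:
    """Software model of the module: DL1, DL2, coefficients C, and rate alpha."""

    def __init__(self, length, alpha):
        self.dl1 = [0.0] * length   # first delay line, newest sample first
        self.dl2 = [0.0] * length   # second delay line, newest sample first
        self.c = [0.0] * length     # filter coefficients C
        self.alpha = alpha          # adaptation rate register, writable externally

    def filter(self, sample):
        """Steps 1 to 3: shift DL1, insert the input, convolve with C."""
        self.dl1 = [sample] + self.dl1[:-1]
        return sum(c * d for c, d in zip(self.c, self.dl1))

    def adapt(self, sample, error):
        """Steps 4 to 6: shift DL2, insert the input, apply the LMS update."""
        self.dl2 = [sample] + self.dl2[:-1]
        self.c = [c + self.alpha * error * d
                  for c, d in zip(self.c, self.dl2)]


def identify_plant(plant, steps=2000, alpha=0.05, seed=0):
    """System identification: adapt C until the module output tracks the plant."""
    rng = random.Random(seed)
    module = AdaptiveFilterModule(len(plant), alpha)
    history = [0.0] * len(plant)
    for _ in range(steps):
        x = rng.choice((-1.0, 1.0))          # synthesized noise sample
        history = [x] + history[:-1]
        desired = sum(h * s for h, s in zip(plant, history))
        y = module.filter(x)
        module.adapt(x, desired - y)         # error = plant output - model output
    return module.c
```

In this driver both delay lines receive the same noise sample, which corresponds to the system identification configuration. In a filtered-x arrangement the second delay line would instead receive the filtered reference.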

Alternatively, the first delay line can consist of the partial sums from the convolution rather than the delayed input data, as described by INMOS in the documentation for the IMS A100. The errors introduced by this transformation when the coefficients are not constant tend to be small.

FIG. 5 shows the filtered-x canceller plus the system identification function configured from only two of the invention modules.

In a given manifestation of the invention a length will be selected for the delay lines and number of filter coefficients, such as 32 for example. The invention provides optional cascading of the modules to obtain longer filters when desired. The output of delay lines DL1 and DL2 can be provided to the inputs of the corresponding delay lines in successive modules. Also, cascading can be facilitated by providing a summing input to each module, the value of which is added to the convolution result. In this manner the convolution output of one stage can be added to the convolution output of the next stage to provide the effect of a longer filter.
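The cascading arrangement can be illustrated with a filter-only model in which each stage passes the sample leaving its delay line to the next stage and accepts a summing input. The names `FilterStage` and `cascade_step` are hypothetical, and only the first delay line is modelled; the second delay line would be chained in the same way.

```python
class FilterStage:
    """One module's filter path: a delay line, coefficients, a summing input."""

    def __init__(self, coeffs):
        self.c = list(coeffs)
        self.dl = [0.0] * len(self.c)   # delay line, newest sample first

    def step(self, sample, sum_in=0.0):
        """Return (convolution plus summing input, sample leaving the delay line)."""
        delay_out = self.dl[-1]                  # feeds the next stage's delay line
        self.dl = [sample] + self.dl[:-1]
        y = sum_in + sum(c * d for c, d in zip(self.c, self.dl))
        return y, delay_out


def cascade_step(stages, sample):
    """Chain the stages: delay-line outputs go forward, partial sums accumulate."""
    total = 0.0
    for stage in stages:
        total, sample = stage.step(sample, total)
    return total
```

Two 2-tap stages driven through `cascade_step` should produce the same output sequence as a single 4-tap stage holding the concatenated coefficients.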

FIG. 6 shows a diagram of the invention including provisions for cascading multiple modules.

There are three independent numerical processes in the Filtered-X LMS algorithm described: the weight update, the convolution and the cross-correlation. Specifically, ##EQU2##

where X^(a) and X^(b) are delay line 1 and delay line 2 as described.

As can be seen, this can be accomplished with only adders and multipliers. This limits the type of computational resources required to perform this task. To keep the cost reasonable one should reduce the number of the resources while maintaining reasonable throughput. Consider a simple vector pipeline to process a convolution. This can be built with one multiplier, one adder, an accumulator, and two vector register files. We can make use of the parallel nature of this computation by overlapping the multiplications and additions. A time-space diagram of the process shows that both the adder and multiplier are in use at every clock tick except the first, where only the multiplier is being used. ##STR1##

The number in the block is essentially the index i in the summation. As you can see, these resources are utilized (almost) 100% of the time. To relate this to throughput we need to find an expression for the number of clock ticks T_(n) required to calculate an n point convolution. This is easily shown to be:

    T_n = (n-1) + 2 = n + 1

The throughput is simply

    \frac{1}{T_n \tau} = \frac{1}{(n+1)\,\tau}

If the clock cycle τ is 100 ns then the throughput for a 32 tap filter is 303 kHz.
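The tick count and throughput figure for the convolution pipeline can be checked with a short calculation. The schedule below is an assumed reading of the overlap described in the text (the adder handles index i one tick after the multiplier does), and the function names are illustrative.

```python
def convolution_schedule(n):
    """Return (multiplier, adder) timelines for an n point convolution.

    Entry t of each list is the summation index handled on tick t + 1,
    or None when that unit is idle.
    """
    ticks = n + 1
    multiplier = [t + 1 if t < n else None for t in range(ticks)]
    adder = [t if t >= 1 else None for t in range(ticks)]
    return multiplier, adder


def throughput_hz(n, tau_seconds):
    """Convolutions per second when each one takes n + 1 clock ticks."""
    multiplier, _ = convolution_schedule(n)
    return 1.0 / (len(multiplier) * tau_seconds)


# 32 taps with a 100 ns clock: 33 ticks per convolution, about 303 kHz.
rate = throughput_hz(32, 100e-9)
```

For a 3 point convolution the schedule is `[1, 2, 3, None]` for the multiplier and `[None, 1, 2, 3]` for the adder, so each unit is idle on exactly one of the four ticks.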

We can now show that we can maintain 100 kHz throughput for all of the equations without adding any computational resources. We will assume that the cycle time τ is 100 ns and that the expression 2με is available without any additional processing. A time-space map of the resource utilization to complete one set of the equations (one of the multiplies and accumulates for the index i) looks like ##STR2##

In this diagram, the A's are the weight updates, the B's are the convolution, and the C's are the cross-correlation. They are arranged in this way because the convolution is dependent on the result of the weight update equation. The cross-correlation calculation is used to fill the gap and maintain the full utilization of the processing resources. Clearly, these operations can be chained as we did for the convolution and the resulting throughput is

    \frac{1}{3(n+1)\,\tau}

and is 101 kHz for a 32 tap filter.

The problem actually becomes more complicated if the cross-correlation is not calculated.

For this case the time-space diagram is ##STR3##

and there are gaps in the middle. Full utilization can be maintained if delays are introduced between the issue of consecutive instructions, as below ##STR4##

The number indicates the ith instruction issued. The second instruction was issued one tick after the first, but one must wait three clock ticks before one can issue the third instruction. In this way maximum throughput can be maintained, which in this case would be 202 kHz for a 32 tap filter.

INTERNAL ARCHITECTURE

The goal can be accomplished with only one multiplier and one adder. Three vector registers, one to hold the coefficients and two for the delay lines, are also needed. There will be an index register associated with each vector register, and a single length register (one for all three), to control their access. Two accumulators are necessary to handle the parallel summations of equations 2 and 3. This architecture is shown in FIG. 7 as an FIR filter system.

Note there are also two general purpose registers that are used to hold the values of μ and ε.

VECTOR REGISTERS

Each vector register is capable of holding a vector of length M. M could be as large as 256, or larger if this is practical for the chip manufacturer. The length of the vector used at run time is controlled by the LENGTH register. This register allows modulo N indexing of the vectors, where N is the length of the summation. The actual indexing is performed by an index register associated with each vector.

Each indexing unit will provide the facilities to increment (decrement) the index register. This gives the flexibility required to perform circular convolutions without actually having to shift the data through the vector register. At the beginning of an update cycle (which consists of the weight update, convolution, and cross-correlation) the index registers will point to the oldest values in the delay lines and the coefficient associated with time τ=0. The first step is to write the newest data over the oldest data in the delay lines. Their index registers are then incremented so that what was the N-1^(th) element in the delay line is now the N^(th) as shown in the figure. ##STR5##

This way the delay lines are completely dynamic without the need to actually move data.
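The index-register scheme described above behaves like a circular buffer. The following is a minimal software sketch under that reading; the class name, the method names, and the choice to expose the contents newest first are assumptions made for illustration.

```python
class CircularDelayLine:
    """Delay line that overwrites its oldest entry instead of shifting data."""

    def __init__(self, length):
        self.data = [0.0] * length
        self.length = length   # plays the role of the LENGTH register (N)
        self.index = 0         # index register: points at the oldest value

    def push(self, sample):
        """Write the newest data over the oldest, then advance modulo N."""
        self.data[self.index] = sample
        self.index = (self.index + 1) % self.length

    def newest_first(self):
        """Read the contents in delay order without moving any data."""
        return [self.data[(self.index - 1 - i) % self.length]
                for i in range(self.length)]


line = CircularDelayLine(4)
for value in [1.0, 2.0, 3.0, 4.0, 5.0]:
    line.push(value)
# The first sample has been overwritten; reading newest first gives 5, 4, 3, 2.
```

Only the index changes on each new sample, which is the property the text relies on to avoid moving data through the vector register.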

COMPUTE ENGINE

The compute engine consists of the multiplier and adder plus the supporting elements that allow the computation to take place.

Data is routed from the vector registers to the compute engine on a 72 bit bus or similar bus. This bus simultaneously carries the three 24 bit words of the coefficients and delay lines. The operands to the multiplier are provided from two multiplexers which are connected as shown in the block diagram. Note that in addition to the vector registers the multiplexer can also select two scalar registers μ and ε. These allow the operations required for the weight update. Initially, μ is multiplied by the error ε. The results are stored back into ε for future use. One of the operands for the adder comes from the multiplier. The other comes from a multiplexer and allows for each of the three equations to be performed. Either the weights can be selected to perform the weight update or one of two accumulators to accommodate the two summations.

EXTERNAL ARCHITECTURE

The external architecture of the chip is difficult to specify without the aid of an actual silicon manufacturer. There are a number of operations that must be performed.

Read the latest value for delay A

Read the latest value for delay B

Read the error ε

Output the result of the convolution

Output the result of the cross-correlation

Write/Read the current values of the coefficients

Cascade the delay lines X^(a) and X^(b)

Enable/Disable weight update

Accept a new value of μ

Accept a value for the length of the vectors

Clearly the last two can be set up as a standard microprocessor interface. Provision can be made to bootstrap appropriate values. Two pins to select from default lengths and reading a port on power-up are suitable.

Provision is also made so that multiple chips can transfer their coefficients. A chip that is being used for plant identification can broadcast its results to one or two other chips. This can be accomplished with the aid of a microprocessor or independently. In either event the possibility of serial communications should not be overlooked since it saves the cost of many pins.

The first five items on the list can be accomplished in a number of ways. A microprocessor interface is one alternative. Direct communication is another. One interesting possibility is the use of A/D and D/A converters. Sigma-delta technology is becoming more and more common in mixed mode applications. If it is possible to integrate these onto the chip in a cost effective manner then the external interfacing becomes a simple matter of summing and buffering with op-amps.

Changes and modifications will occur to those of ordinary skill in the art without departing from the scope of the following claims.

We claim:
 1. An adaptive canceller filter module means for use with filtered-X algorithms in active noise cancellation systems, said module means comprising: a first signal sensor means to receive a reference signal; a first filter means adapted to produce a cancellation signal in response to said reference signal; a second filter means adapted to filter the reference signal produced by the first filter means, the impulse response of which models at least the phase characteristics of the external environment and produces an output signal; second signal sensor means to receive an error signal; and adaptive means using an adaptive output signal of said second filter means and said error signal to adapt the coefficients of said first filter means to minimize a measure of error.
 2. An adaptive canceller filter module means as in claim 1, wherein the second filter means and adaptive means are integrated into a single unit.
 3. An adaptive canceller filter module means as in claim 1, including a plant error signal generating means adapted to receive input from said signal sensor means and produce a plant error signal which is fed to said adaptive means.
 4. An adaptive canceller filter module means as in claim 1, including a noise generating means adapted to produce said cancellation signal in conjunction with said first filter means.
 5. An adaptive canceller filter module means as in claim 4, wherein the second filter means and the adaptive means consist of one integrated adaptive filter means.
 6. An adaptive canceller filter module means as in claim 1, wherein said first filter means is a FIR filter and said second filter means is a FIR filter.
 7. An adaptive canceller filter module means as in claim 6, wherein said first filter means has a vector of filter coefficients adapted to be acted upon and changed by the adaptive means output signal.
 8. An adaptive canceller filter module means as in claim 7, wherein said filter means comprise an adaptive Finite Impulse Response Filter chip.
 9. The method of minimizing a measure of error in an active noise cancellation system including a first delay line, a vector of filter coefficients, an adapter and a second delay line, comprising: providing a reference signal, shifting the contents of said first delay line one place; placing the reference signal in the initial position in the delay line; convoluting a current values of the filter coefficients with the first delay line to produce an output value; shifting the contents of the second delay line one place; placing the current input to the second delay line in the initial position thereof; and adapting the current values of the filter coefficients according to the adaption algorithm using the current values of the filter coefficients, the contents of the second delay line and an adaption rate to assure convergence.
 10. A method as in claim 9, wherein convoluting the current values of the filter coefficients is performed according to the formulas: ##EQU5##